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Abstract 

In  this  paper  we  consider  a  general  formulation  of  a  discrete-time  filtering  problem 
for  descriptor  systems.  It  is  shown  that  the  nature  of  descriptor  systems  leads  directly 
to  the  need  to  examine  singular  estimation  problems.  Using  a  “dual  approach”  to 
estimation  we  derive  a  so-called  “3-block”  form  for  the  optimal  filter  and  a  corresponding 
3-block  Riccati  equation  for  a  general  class  of  time-varying  descriptor  models  which 
need  not  represent  a  well-posed  system  in  that  the  dynamics  may  be  either  over-  or 
under-constrained.  Specializing  to  the  time-invariant  case  we  examine  the  asymptotic 
properties  of  the  3-block  filter,  and  in  particular  analyze  in  detail  the  resulting  3-block 
algebraic  Riccati  equation,  generalizing  significantly  the  residts  in  [23,  28,  33].  Finally, 
the  noncausal  nature  of  discrete-time  descriptor  dynamics  implies  that  future  dynamics 
may  provide  some  information  about  the  present  state.  We  present  a  modified  form  for 
the  descriptor  Kalman  filter  that  takes  this  information  into  account. 
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1  Introduction 


In  this  paper,  we  address  the  problem  of  recursive  estimation  for  a  general  class  of  descriptor 
systems.  Specifically  the  systems  that  we  consider  are  of  the  form^; 


EkJrix{k  +  l)  =  +  u(A:),  A;  >  0 

t/(A:  +  l)  =  Ck+ix{k  +  1)  +  r{k),  k>Q 


(1.1) 

(1.2) 


where  the  matrix  Ek+i  is  Ik  x  Uk+i,  Ak  is  Ik  x  nk,  and  Ck+i  is  pk+i  x  Uk+i-  Here  u  and  r 
are  zero-mean  white  Gaussian  noise  sequences  with 
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u{k) 
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uu)  Y' 
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(1.3) 


where  M.{.)  denotes  the  mean  and  S{k)  =  liffc  =  0  and  0  otherwise.  Also  we  assume 
that  ®(0)  is  a  Gaussian  random  variable  independent  of  {u^{k),r^{k))^),  with  mean  xo 
and  covariance  Pq,  independent  of  (u^(fe),  r^(fe))^.  The  problem  that  we  consider  in  this 
paper  is  the  recursive  estimation  of  x{k). 

Several  aspects  of  this  model  deserve  comment.  The  study  of  descriptor  systems,  of 
course,  has  a  rich  and  growing  literature  [9,  10,  18].  Some  of  the  motivation  for  this  activity 
comes  from  applications  in  which  the  natural  descriptions  of  systems  involve  both  dynamics 
and  constraints  among  variables,  leading  to  models  of  the  form  (1.1)  with  a  possibly  singular 
matrix  E  on  the  left-hand  side.  Furthermore  in  studies  such  as  [35]  it  is  argued  that  models 
of  this  type  are  a  natural  starting  point  for  modeling  when  we  are  attempting  to  deduce 
relations  among  dynamically  evolving  quantities  rather  then  imposing  causative  structure. 
Indeed  in  [19]-[21]  as  well  as  in  our  previous  work  [23]-[29],  it  has  been  emphasized  that 
descriptor  models  such  as  (1.1)  can  be  used  to  describe  noncausal  phenomena  -e.g.  where 
the  variable  represents  space  rather  than  time-  which  typically  involve  such  dynamics 
together  with  boundary  conditions.  In  fact  in  a  subsequent  paper  [26]  the  results  that  we 
develop  here  ^^re  used  for  constructing  efficient  smoothing  algorithms  for  such  boundary- 
value  models. 

A  second  point  to  note  concerning  the  model  (1.1)  is  that  it  allows  the  possiblity  that 
the  dimensions  of  the  problem-  the  number  Ik  of  dynamic  constraints,  the  number  pk  of 
measurements,  and  the  dimension  of  x{k)-  may  vary  with  k.  As  we  shall  see,  this  does 
not  cause  any  difficulty  in  our  analysis,  but  this  is  not  our  reason  for  including  this  level  of 
generality.  A  better  reason  is  that  such  a  situation  arises  naturally  in  “recursive”  descrip¬ 
tions  of  two-dimensional  (2-D)  phenomena.  Specifically,  as  shown  in  [16],  there  are  very 
natural  directions  of  recursion  for  boundary- value  models,  namely  in  from  or  out  towards 
the  boundary.  The  inward  propagation,  for  example,  involves  propagating  boimdary  con¬ 
ditions  into  the  domain  of  interest.  In  1-D  problems,  where  the  bormdary  consists  of  two 
points,  inward  propagation  leads  to  a  new  boundary  which  has  also  two  points.  In  2-D  how¬ 
ever,  the  boundary  of  a  compact  domain  changes  size  as  we  shrink  or  expand  the  domain. 

^Tlie  indexing  clioices  in  (Ll)-(1.2)  liaA'e  been  made  in  part  to  simplify  the  subsequent  development. 
For  example,  the  use  of  the  notation  of  r(k)  in  (1.2)  rather  than  r{k  -f  1)  is  consistent  with  and  .simplifies 
(1.3).  Specifically  with  these  choices,  the  noises  u{k)  and  r(k)  are  possibly  correlated,  and  both  aft'ect  the 
information  we  acquire  on  x(k  -f-  1). 
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Thus  if  we  think  of  x(k)  as  representing  the  values  of  a  process  along  such  a  shrinking  or 
expanding  boundary,  we  see  that  we  have  no  choice  but  to  deal  with  changing  dimensional¬ 
ity.  While  we  do  not  explicitly  focus  on  such  problems  in  this  paper,  our  analysis  contains 
all  the  elements  necessary  to  make  it  directly  applicable  to  2-D  estimation  problems. 

Another  reason  for  allowing  the  possibility  of  changing  dimensionality  is  one  that  has 
other  even  more  important  implications.  Specifically,  rather  than  thinking  of  (1.1)  as  de¬ 
scribing  system  dynamics,  we  may  wish  to  think  of  it  as  providing  a  set  of  possibly  noisy 
constraints  on  the  behavior  of  x.  Prom  this  perspective  the  information  in  (1.1)  plays  es¬ 
sentially  the  same  role  as  that  in  (1.2),  the  only  apparent  difference  being  that  each  piece 
of  information  in  (1.1)  concerns  x  at  two  consecutive  points  in  time  rather  than  the  single¬ 
point-in-time  nature  of  the  information  (1.2).  From  this  perspective  allowing  a  change  in 
dimensionality  corresponds  to  allowing  the  possibility  that  at  some  points  in  time  we  might 
have  more  pieces  of  information  than  at  others.  Also,  this  perspective  opens  the  question 
of  the  order  in  which  these  pieces  of  information  are  incorporated.  For  the  most  part  in 
this  paper  we  will  use  the  obvious  ordering,  namely  we  use  (1.1)  through  time  k  to  estimate 
x(k).  However  there  arc  other  possibilities.  In  particular  we  also  consider  in  this  paper  the 
use  of  (1.1)  over  the  entire  time  interval  of  interest,  together  with  (1.2)  through  time  k  to 
estimate  x(k).  As  we  will  see,  when  Ek  =  J,  there  is  no  difference  between  these  two  cases, 
but  there  is  a  difference  when  one  considers  the  case  of  Ek  singular,  again  emphasizing 
the  noncausality  of  such  models.  Furthermore,  the  possible  singularity  of  Ek  coupled  with 
the  interpretation  of  (1.1)  as  an  additional  source  of  “measurement”  data,  leads  directly  to 
the  need  to  consider  the  possibility  that  some  “measurements”  are  perfect.  Thus,  in  our 
formulation  we  allow  the  possibility  that  Rk  in  (1.3)  is  singular. 

Recursive  estimation  for  descriptor  systems  has  been  the  subject  of  several  studies  in 
recent  years  [8,  13,  23,  24,  28,  33].  In  particular  in  [24]  we  addressed  this  problem  in  the 
context  of  optimal  smoothing  for  well-posed,  constant  coefficient  boundary- value  descriptor 
systems,  i.e.  systems  of  the  form  (1.1)-(1.2)  which  are  square  and  constant  (i.e.  Ik  = 
Uk  =  n,  pk  =  p,  and  aU  matrices  are  constant),  together  with  both  the  assumption  that 
{E,  A}  is  a  regular  pencil  and  a  set  of  boundary  conditions  which  yield  a  unique  solution 
to  (1.1)  for  any  input  u{k).  In  that  paper  we  used  the  method  in  [1]  to  derive  a  2n  x  2n 
Hamiltonian  (bormdary-value  descriptor)  system  for  the  optimal  smoother  assuming  also 
that  the  measurement  noise  covariemce  R  was  nonsingular.  In  addition,  we  introduced  a 
new  generalized  algebraic  Riccati  equation  and  showed  that  if  a  solution  existed  to  this 
equation,  the  Hamiltonian  dynamics  could  be  decoupled  leading  to  parallel  forward  and 
backward  recursions  reminiscent  of  the  Mayne- Fraser  smoothing  algorithm.  In  subsequent 
work  in  developing  a  system  theory  for  such  systems  we  obtained  a  set  of  necessary  and 
sufficient  conditions  for  the  existence  and  uniqueness  of  positive  definite  solutions  for  this 
class  of  generalized  Riccati  equations  [23,  28]  and  also  provided  a  statistical  interpretation 
for  this  solution.  More  recently  Wang  and  Bernhard  [33]  have  developed  some  closely-related 
results  by  dualizing  their  work  on  optimal  control  for  descriptor  systems  [7].  Because  of 
this  perspective,  less  attention  was  paid  to  statistical  interpretations  of  the  restilts,  and  also 
their  approach  deals  with  estimating  Ex{k)  rather  than  x{k).  On  the  other  hand,  Wang 
and  Bernhard  consider  the  more  general  case  in  which  the  pencil  {E,  A}  need  not  be  regular 
and  in  fact  may  not  even  be  square  (so  that  I  ^  n)  and  in  this  context  develop  analogous 
results  on  filter  convergence  and  Riccati  equations  to  those  in  [23,  28]. 
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While  the  restriction  to  the  estimation  of  Ex{k)  is  not  significant  if  E  is  invertible,  it 
is  substantive  if  E  is  singular.  Furthermore,  as  we  have  hinted,  the  possible  singularity  of 
E  an  A,  together  with  the  objective  of  estimating  all  of  x{k),  leads  directly  to  the  need 
to  consider  the  possibility  of  perfect  “measurements”  either  through  the  dynamics  (1.1) 
or  the  observations  (1.2).  In  this  paper,  we  develop  a  procedure  for  optimal  recursive 
estimation  that  is  valid  in  the  most  general  framework-  with  E  and  A  not  necessarily 
square  nor  invertible  and  with  possibly  singular  measurement  noise  covariances.  As  we  will 
see,  considering  such  a  problem  leads  to  the  introduction  of  what  we  refer  to  as  “3-block” 
forms  for  Hamiltonians,  filters,  and  Riccati  equations.  Such  forms  actually  can  be  found 
in  various  contexts  in  a  number  of  papers  in  estimation  and  control  [3,  4,  5,  14,  22,  32, 
34].  Our  work  builds  most  directly  on  the  approach  of  Whittle  [34],  Chapter  11,  and  the 
machinery  for  singular  estimation  in  Campbell  and  Meyer  [11]  to  derive  not  only  a  new 
3-block  Hamiltonian  form  valid  in  our  general  context  but  also  a  new  3-block  generalized 
Riccati  equation.  In  addition  in  the  constant  dimension/constant  matrix  case  we  develop 
convergence  and  steady-state  results  for  the  algebraic  version  of  this  equation,  thereby 
extending  the  earlier  results  in  [23,  28,  33]. 

In  the  next  section  we  present  and  review  some  of  the  basic  concepts  concerning  max¬ 
imum  likelihood  parameter  estimation  with  particular  emphasis  on  deriving  a  form  that  is 
valid  when  some  of  the  measurements  are  perfect.  These  results  allow  us  in  Section  3  to 
address  the  filtering  problem  for  the  system  (1.1)-(1.2)  resulting  in  the  3-block  form  for 
the  descriptor  Kalman  filter  and  a  corresponding  3-block  Riccati  equation.  In  Sections  4 
and  5  we  then  focus  on  the  time-invariant  case.  In  the  first  of  these  sections  we  generalize 
the  results  in  [23,  28,  33]  by  studying  in  detail  the  asymptotic  properties  of  the  descriptor 
Kalman  filter.  In  particular  we  provide  conditions  for  filter  stability  and  for  the  convergence 
of  the  solution  to  the  Riccati  equation.  Conditions  under  which  the  resulting  3-block  alge¬ 
braic  Riccati  equation  has  a  unique  positive  semi-definite  solution  are  given,  and  in  Section 
5  we  generalize  the  well-known  eigenvector  approach  to  solving  standard  Riccati  equations 
[31,  32]  to  OUT  3-block  form.  Finally,  in  Section  6  we  show  how  the  estimation  procedure 
we  have  developed  can  be  modified  to  account  for  the  information  about  x{k)  contained  in 
the  future  dynamic  constraints. 

2  A  Look  at  Maximum  Likelihood  Estimation 

In  this  section  we  examine  a  few  features  of  maximum  likelihood  (ML)  linear  estimation 
beginning  with  the  simple  problem  of  estimating  an  miknown  n-dimensional  vector  x  based 
on  the  p-dimensional  measurement  vector 

y  =  HxA  V,  (2.1) 

where  u  is  a  zero-mean  Gaussian  random  vector  with  covariance  R.  While  this  is  a  well- 
studied  problem,  it  is  worth  making  a  few  comments  about  it.  First,  note  that  the  study 
of  this  estimation  problem  actually  includes  least-squares  Bayesian  estimation  for  Gaussian 
vectors.  Specifically,  consider  the  problem  of  computing  the  least-squares  estimate  of  a 
Gaussian  random  vector  x  with  mean  m  and  covariance  P  based  on  the  measurement 
vector  z  =  Cx-\-  n  where  n  is  zero-mean  Gaussian,  independent  of  x,  with  covariance  N .  It 
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is  straightforward  to  check  that  this  problem  yields  the  same  estimate  as  the  ML  problem 
with 

»=(T)'  0  w)- 


(2.2) 


We  focus  here  on  the  ML  viewpoint,  which  in  the  next  section  will  lead  to  our  interpreting 
dynamic  constraints  as  in  (1.1)  as  additional  pieces  of  information  or  measurements.  A 
second  point  is  that  we  focus  here,  for  the  most  part,  on  the  case  in  which  x  is  estimable,  i.e. 
in  which  (2.1)  provides  sufficient  constraints  so  that  we  can  in  fact  estunate  all  components 
of  X.  This  is  equivalent  to  assuming  that  H  has  rank  n  =  dim(a:),  which,  for  example,  is 
aways  true  in  the  Bayesian  case  (i.e.  H  in  (2.2)  obviously  has  rank  n). 

The  third  and  most  importemt  point  for  us  is  that  we  wish  to  consider  ML  problems 
where  R  may  not  be  of  full  rank.  If  H  and  R  have  full-rank,  the  solution  to  the  ML  problem 
is  easy  to  write  out  explicitly: 


xml  =  {H^R-^H)-^H^R-^^ 


(2.3) 


(2.4) 


The  error  variance  associated  with  this  estimate  is  given  by 

Pml  =  M\{x  -  xml){x  -  xml)^]  =  {H^R~^H)~^. 

The  fact  that  the  calculations  can  be  described  in  such  explicit  form  is  extremely  im¬ 
portant  as  it  allows  us  to  obtain  an  explicit  recursive  structure  for  sequential  estimation 
problems.  What  we  would  like  to  do  is  to  obtain  an  equally  explicit  form  when  R  is  singular. 
To  do  this,  we  begin  by  recasting  the  ML  estimation  problem  as  a  quadratic  minimization 
problem.  This  approach,  described  by  Whittle  [34]  and  by  Campbell  and  Meyer  [11],  can 
also  be  traced  to  early  optimal  control-based  derivations  of  the  Kalman  filter  such  as  in  [6]. 

Let  F  be  a  full-rank  square  root  of  R  (so  that  FF^  =  R).  Then  we  can  write  v  =  Vw, 
where  tu  is  a  zero-mean  Gaussian  random  vector  with  covariance  I.  If  we  then  view  the 
measurement 

y  =  Hx  +  Vw  (2.5) 

as  a  linear  constraint  on  x  and  w,  the  ML  problem  is  simply  one  of  finding  a  pair  {x,w) 
satisfying  (2.5)  and  maximizing  the  probability  density  of  w  or,  equivalently,  minimizing 

J{w)  =  {l/2)w^w.  (2.6) 

This  problem  is  readily  solved  using  the  Lagrange  multipliers.  Specifically,  let 

L{w,x,X)  =  {l/2)w^w  -f  X^{y  —  Hx  —  Vw). 

Setting  the  partials  with  respect  to  u;,  a:,  and  A  to  zero  yields 

w-V^X  =  0 
H^X  =  0 
y  —  Hx  —  Vw  =  0. 

Using  (2.8a)  to  eliminate  w  gives  the  2-block  (p-f  n)-dimensional  set  of  equations 


(2.7) 

(2.8a) 

(2.8b) 

(2.8c) 


R  H  \  X 
0  /  I  a: 


(2.9) 
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A  first  obvious  question  about  this  set  of  equations  concerns  the  invert ibility  of  the 
(p4-  n)  X  (p+  n)  matrix  in  (2.9).  Note  that  one  obvious  necessary  condition  is  that  H  must 
have  full  column  rank,  as  otherwise  the  last  n  columns  would  not  be  linearly  independent.  A 
second  immediate  necessary  condition  is  that  the  first  p  rows  must  be  linearly  independent. 
The  following  shows  that  this  pair  of  conditions  is  also  sufficient. 

Lemma  2.1  Let  R  be  positive  semi-definite  and  H  a  full  column  rank  matrix.  Then,  if 
[jR  H]  has  full  row  rank,  the  matrix 

(  0  ) 

is  invertible. 

Proof;  Suppose  that 

^)=o. 

Then 

x'^R  +  =  0 

and 

x^H  =  0.  (2.13) 

If  we  now  take  the  transpose  of  (2.13)  and  multiply  it  on  the  left  by  we  get 

y'^H^x  =  0,  (2.14) 


(2.11) 

(2.12) 


which  after  substitution  in  (2.12)  postmultiplied  by  x  yields 

x^Rx  =  0,  (2.15) 


which  since  R  is  positive  semi-definite  gives  x^R  =  0.  Together  with  (2.13),  this  yields 
x^[R  H]  =  0,  which  implies  a:  =  0,  since  [R  H]  has  full  row  rank.  Then  (2.12)  implies 
that  y^H^  =  0,  and  since  H  has  full  rank  we  have  y  =  0,  so  that  (2.10)  is  invertible.  □ 


Assuming  that  (2.10)  is  invertible,  we  have  from  (2.9)  that 

0  , 


®ikfi 


(  ^ 

H  \  ^  ( 

(o  /) 

(  RT 

0 

(2.16) 


and  by  direct  calctdation  we  find  that  xml  is  unbiased  and  has  for  error 


covariance 


Writing 

(  R  Q  \  _  (  R  Q  \  _(  0  0\ 

\^ooj  0  ) 


(2.17) 


(2.18) 
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we  obtain  the  following  simpler  expression  for  Pml' 

=  j)(/^  (j).  (2,19) 

The  condition  that  [R  H]  has  full  column  rank  has  a  simple  physical  interpretation, 
namely  that  there  are  no  redundant  perfect  measurements  (i.e.  that  no  independent  linear 
combinations  of  the  observations  yield  noise  free  measurements  of  the  same  linear  combina¬ 
tion  of  components  of  a:).  While  this  would  certainly  seem  to  be  a  reasonable  assumption 
and  can  in  principle  be  enforced  by  identifying  redundancies  and  eliminating  them,  it  is 
convenient  to  have  a  result  that  applies  even  when  (2.10)  is  not  invertible^.  In  this  case, 
as  one  might  expect,  it  is  necesseiry  to  use  pseudo-inverses.  As  discussed  in  [11],  there  are 
various  sets  of  properties  one  can  impose  in  defining  pseudo-inverses.  For  our  purposes 
here  it  suffices  to  take  the  pseudo-inverse  of  a  symmetric  matrix  Z  to  be  any  synunetric 
matrix  for  which 

ZZ^Z  =  Z  (2.20) 

(this  is  what  is  referred  to  in  [11]  as  a  (l)-inverse).  Then  we  have  the  following: 


Lemma  2.2  Suppose  that  H  has  full  column  rank.  The  ML  estimate  of  x  based  on  the 
measurement  vector  (2.1)  is  given  by 


^ML 


=(o  /)( 


R  H 
0 


This  estimate  is  unbiased  and  has  for  error  covariance 


Pml  =  “  (  ®  -^  ) 


R 


(2.21) 


(2.22) 


This  restdt  is  proved  in  [11],  although  in  our  case  we  can  say  a  bit  more.  Specifically, 
let 

Then  in  the  case  where  H  has  full  column  rank,  T  is  unique,  while  W  and  U  will  not  be 
unless  (2.10)  is  invertible.  Note  that  from  (2.21) 

=  U'^y  (2.24) 


so  that  the  gains  in  the  ML  estimator  may  not  be  unique-  reflecting  the  fact  that  there 
are  nonunique  ways  in  which  to  determine  certain  linear  combinations  of  components  of  x 
exactly.  On  the  other  hand  the  resulting  error  covariance  shotdd  be  unambiguously  defined, 
and  from  (2.22),  (2.23)  we  see  that  it  is,  since  Pml  =  -T.  We  refer  the  reader  to  Appendix 

'^Indeed  while  it  may  be  easy  to  keep  t  rack  of  and  eliminate  redundancies  in  a  given  set  of  ineasurcniculs, 
it  is  more  difficult  to  do  this  in  an  organized  and  ea.sily  expressible  way  when  those  redundancies  may  e\'olve 
dynamically  and  arise  through  the  dynamic  constraint.s  a.s  well  as  the  measurements. 


7 


A  for  a  summary  of  several  of  the  calculations  and  some  results  from  [11].  In  particular,  we 
prove  the  identity 


which  is  used  below. 

If  H  does  not  have  full  colunm  rank,  then,  as  we  have  indicated,  x  is  not  estimable 
so  that  the  ML  estimate  of  aU  of  x  is  mdefined.  Nevertheless  in  such  a  situation  various 
linear  combinations  of  x  may  be  estimable  (e.g.  obviously  {xi  +  0:2)  is  estimable  from  the 
observation  y  =  [xi  +  0:2)).  The  precise  definition  of  estimability  given  in  [11]  is  that  the 
linear  combination  of  c^x  is  estimable  if  there  exists  a  measurement  linear  combination 
d^y  that  is  an  tmbiased  estimate  of  c^x.  An  essentially  immediate  necessary  and  sufficient 
condition  for  this  is  that  c  must  be  in  the  range  of  Let  r  denote  the  rank  of  H  and  let 
H  =  H1H2  denote  a  full-rank  factorization  of  H,  i.e.  Hi  is  a  p  x  r  full  column  rank  matrix 
and  H2  is  an  r  X  n  full  row  rank  matrix.  Then  it  is  precisely  z  =  H2X  whose  ML  estimate 
can  be  computed  from  y.  Furthermore  from  results  in  [11]  we  can  deduce  that 


R  H  \  (  U 

0  ) I  T 


(2.25) 


ZML 


R  H 
0 


I 

0 


y  (2.26) 


is  an  unbiased  estimate  of  z  with  associated  error  covariance 


Pml  =  “  {  0  ^  ) 


H 

0 


0 


(2.27) 


Next,  we  prove  several  results  that  provide  the  justification  for  the  recursive  procedure 
that  will  be  employed  below  for  computing  ML  estimates.  The  first  of  these  results  states 
that  for  the  purpose  of  estimating  other  variables,  we  can  replace  several  measurements  of 
a  variable  by  its  estimate  based  on  these  measurements. 


Lemma  2.3  Let  x  and  z  be  unknown  vectors  and  consider  the  observations 


a  =  Hx  +  V  (2.28) 

b  =  Jx  +  Kz  -b  w  (2.29) 

where  v  and  w  are  independent,  zero-mean  Gaussian  vectors  with  covariance  matrices  V  and 
W,  respectively.  Suppose  that  x  is  estimable  based  on  (2.28)  only,  and  that  z  is  estimable 
based  on  both  (2.28)  and  (2.29).  Let  xi  be  the  estimate  of  x  based  on  (2.28),  and  let  Pi  be 
the  associated  error  covariance  matrix.  Then  x,  the  estimate  of  x  based  on  both  (2.28)  and 
(2.29),  and  its  associated  estimation  error  covariance  P,  are  identical  to  the  estimate  and 
estimation  error  covariance  resulting  from  estimating  x  from  (2.29)  and  the  observation 

xi  —  X  u  (2.30) 


where  u  is  zero-mean  and  Gaussian,  independent  of  w,  with  covariance  Pi. 

Furthermore,  the  estimate  z  and  estimation  error  covariance  of  z  based  on  (2.28)  and 
(2.29)  are  the  same  as  its  estimate  and  error  covariance  based  on  (2.29)  and  (2.30). 
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Proof:  Since  x  is  estimable  from  (2.28),  H  must  have  full  column  rank.  This,  coupled 
with  the  assumption  that  z  is  estimable  from  both  (2.28)  and  (2.29),  implies  that  K  has 
full  column  rank.  Consider  the  joint  estimation  of  x  and  z  based  on  (2.28)  and  (2.29).  If 
Xa  and  Afr  denote  the  Lagrange  imdtiplier  vectors  associated  with  observations  (2.28)  and 
(2.29),  respectively,  according  to  (2.9)  the  multiplier  vectors  and  estimates  x  and  z  satisfy 
the  system 


(  V  Q  H  0  \ 

^  a  ^ 

0  W  J  K 

-^6 

b 

JT  Q  Q 

X 

0 

o 

o 

o 

\  ^  y 

voy 

We  seek  to  compare  this  “batch”  estimation  method,  where  zdl  the  measurements  are  pro¬ 
cessed  at  the  same  time,  with  the  “sequential”  approach  where  we  first  estimate  x  based 
on  (2.28)  alone,  and  then  combine  the  resulting  estimate  in  the  form  of  the  summary  mea¬ 
surement  (2.30)  with  observation  (2.29)  in  order  to  estimate  both  x  and  z.  If  Ai<j  and  xt 
are  the  Lagrange  multiplier  vector  and  estimate  of  x  based  on  (2.28)  alone,  and  if  Ui  and 
P\  are  the  associated  estimator  matrix  and  error  covariance,  we  see  from  (2.9)  and  from 
identity  (2.25)  that  they  satisfy 


(  V  hWAi„ 

\  0  /  \  -Pi  /  V  0  J  j  • 


(2.32) 


Similarly,  the  estimates  of  x  and  z  based  on  (2.29)  and  the  smnmary  measurement  (2.30) 
are  obtained  by  solving 


/Pi  0  I  0  \ 

( ^ 

j 

^  Xi  '' 

0  W  J  K  \ 

Ai, 

b 

O 

O 

X 

0 

o 

o 

o 

\  ^  y 

V  0  y 

(2.33) 


where  Ai  denotes  here  the  Lagrange  multiplier  vector  associated  to  the  measurement  (2.30). 
What  we  need  to  show  is  that  the  solution  of  systems  (2.32)  and  (2.33)  combined  is  equiv¬ 
alent  to  solving  (2.31).  This  requires  eliminating  Ai^,  ®i  and  Ai  from  (2.32)  and  (2.33)  and 
finding  an  expression  for  the  variable  Aa  of  (2.31)  in  terms  of  these  quantities.  Note  that 
our  choice  of  notation  already  indicates  that  the  variables  z  and  A^  which  appear  in  both 
systems  are  the  same. 

First,  we  observe  that  the  second  and  fourth  block  row  of  (2.31)  already  appear  as  the 
second  and  fourth  block  rows  of  (2.33),  so  that  we  need  only  to  obtain  the  first  and  third 
block  rows  of  (2.31)  from  (2.32)  and  (2.33). 

First  block  row;  We  multiply  the  first  block  row  of  (2.33)  on  the  left  by  H.  This  gives 


PPiAi  +  Hx  =  Hxi. 


(2.34) 


Now  using  the  identities  corresponding  to  the  (1,1)  and  (1,2)  blocks  of  (2.32),  we  find 


FPiA: -I- P®  =  a  -  FAia 


so  that  if  we  denote 


+  UiXi, 


(2.35) 

(2.36) 
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we  get  the  first  block  row  of  (2.31). 

Tliird  block  row:  Consider  the  third  block  row  of  (2.33): 

Ai  +  =  0.  (2.37) 

Substituting  the  identities  corresponding  to  the  (2,1)  and  (2,2)  blocks  of  (2.32)  inside  this 
identity,  we  obtain 

ffjA„  +  jjAt  =  0  (2.38) 

where  Aa  is  defined  as  in  (2.36).  But  (2.38)  is  just  the  third  block  row  of  (2.31). 

Since  the  estimates  of  x  and  z  obtained  by  the  batch  and  sequential  methods  are  identical 
functionals  of  a  and  6,  and  the  statistical  asstmiptions  for  a  and  b  are  the  same  under  both 
methods,  the  error  covariances  are  the  same.  □ 

Note  that  in  the  above  proof,  the  assumption  that  x  is  estimable  from  (2.28)  was  needed 
to  ensure  that  the  second  block  colmnn  of  (2.32)  holds.  On  the  other  hand,  the  assumption 
that  z  is  estimable  from  both  (2.28)  and  (2.29),  i.e.  that  K  has  full  rank,  was  only  required 
insofar  as  we  wanted  to  discuss  the  properties  of  the  ML  estimate  z  and  the  associated  error 
covariance.  It  is  not  necessary  if  we  are  only  interested  in  estimable  linear  combinations  of 
the  entries  of  z,  and  the  above  result  can  easily  be  restated  in  a  way  that  does  not  require 
z  to  be  estimable. 

Lemma  2.3  shows  that  previously  processed  measurements  can  be  aggregated  in  the  form 
of  a  summary  measurement  (2.30)  for  the  variables  that  have  been  estimated.  However  the 
summary  measurement  (2.30)  wiU  include  estimates  of  variables  that  do  not  appear  in 
subsequent  measurements  and  in  which  we  are  no  longer  interested.  Conventional  wisdom 
suggests  that  measurements  associated  to  such  exogenous  variables  can  be  discarded  without 
affecting  the  estimation  of  the  other  variables.  The  following  result,  which  is  expressed  in 
its  most  general  form,  provides  a  criterion  for  dropping  unneeded  measurements. 

Lemma  2.4  Consider  the  observations 

where  xi  and  *2  are  two  unknown  vectors,  and  [vf  is  a  zero-mean  Gaussian  vector 

with  covariance 

(  Rn  Ri2  \ 

\  RT2  R22  )  ■ 

Suppose  that  H3  has  full  row  rank,  and  Hi  has  full  column  rank.  Then  the  ML  estimate  of 
xi  based  on  both  yi  and  t/2  is  the  same  as  that  based  on  yi  alone. 

The  assmnption  that  Hi  has  full  column  rank  is  introduced  here  to  guarantee  that  xi 
is  estimable  from  the  yi  measurement,  but  it  can  be  removed  if  we  only  seek  to  estimate 
estimable  linear  combinations  of  the  entries  of  xi. 

Proof;  Let  Ai  and  A2  be  the  Lagrange  multiplier  vectors  associated  to  the  yi  and  y2 
measurements,  respectively,  and  let  xi  and  X2  be  the  estimates  of  xi  and  X2  based  on  both 
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y\  and  y^-  According  to  (2.9),  they  satisfy  the  system 


(  Ru  Ri2  Hi  0  N 

(  ^ 

^  yi  ^ 

R22  H2  Hz 

A2 

y2 

0  0 

Xi 

0 

\  0  ^3^  0  0  y 

\  J 

1  0  / 

Since  Hz  has  full  row  rank,  the  relation  —  0  hnplies  that  A2  =  0,  so  that  we  can 

delete  the  second  and  fourth  block  rows  and  columns  from  (2.40).  This  gives 


(  ffi  W  Ai 

\ht  0  j  \ 


(2.41) 


which  is  precisely  the  system  corresponding  to  the  ML  estimation  of  xi  (or  estimable  linear 
combinations  of  its  entries)  from  yi  alone.  □ 


The  combination  of  Lemmas  2.3  and  2.4  provides  a  general  mechanism  for  generating 
ML  estimates  rectnsively.  Specifically,  consider  the  two  observations 


a  =  Hxi  +  Gxz  +  v  (2.42) 

b  =  Jxi+Kz  +  w  (2.43) 

where  we  assume  that  xi  and  052  are  jointly  estimable  from  (2.42),  i.e.  [if  G]  has  full 
column  rank,  and  z  is  estimable  from  (2.42)  and  (2.43),  i.e.  K  has  full  coltunn  rank.  As 
in  Lenuna  2.3,  we  assume  that  v  and  w  are  zero-mean  independent  Gaussian  vectors  with 
covariance  V  and  W,  respectively.  From  Lemma  2.3,  we  see  that  the  measmrement  (2.42) 
can  be  replaced  by  the  summary  measurements 


+  «i  (2.44a) 

=  *2  +  «2  (2.44b) 

where  xi  and  £2  denote  the  ML  estimates  of  xi  and  X2  based  on  (2.42)  alone,  and  the  covari¬ 
ance  of  [uf  is  the  corresponding  estimation  error  covariance.  Then,  the  ML  estimates 
of  *1  and  z  based  on  both  (2.43)  and  (2.44a)-(2.44b)  are  the  same  as  those  derived  from 
(2.42)  and  (2.43).  But  X2  does  not  appear  in  observation  (2.43),  and  the  system  obtained 
by  combining  (2.43)  and  (2.44a)-(2.44b)  satisfies  the  assmnptions  of  Lenuna  2.4,  so  that  for 
the  purpose  of  estimating  x\  and  z  we  can  drop  the  measurement  (2.44b).  This  shows  that 
the  ML  estimates  of  x^  and  2  based  on  both  (2.43)  and  the  summary  measurement  (2.44a) 
are  the  same  as  those  based  on  (2.42)  and  (2.43).  This  general  procedure,  whereby  previ¬ 
ously  estimated  variables  are  replaced  by  srmmiary  measurements,  and  irrelevant  variables 
are  discarded,  constitutes  the  basis  for  the  descriptor  Kalman  filtering  method  of  Section  3. 

Finally,  we  prove  the  intuitively  obvious  fact  that  for  a  given  measurement  set,  if  the 
noise  covariance  increases,  the  error  covariance  of  the  ML  estimate  also  increases. 


Lemma  2.6  Consider  the  observations 


where  vi  and  V2  are  zero-mean  Gaussian  vectors  with  covariances  Vi  and  V2 ,  and  suppose 
that  H  has  full  column  rank.  Then,  if  V2  >  Vi,  the  estimation  error  variance  associated 

with  estimating  x  based  on  (2.45a)  is  less  than  or  equal  to  the  estimation  error  variance  for 

estimating  z  based  on  (2.45b). 

Proof:  Let  xi,  Pi  and  Z2,  P2  denote  the  ML  estimates  and  estimation  error  covariances 
associated  with  (2.45a)  and  (2.45b),  respectively. 

As  shown  in  Appendix  A 

h  =  U^y2,  P2  =  UfV2U2  (2.46) 

where  Uj  is  a  left-inverse  of  H  satisfying  some  additional  conditions.  Consider  then  the 
following  estimate  of  xi 

x  =  U^yi.  (2.47) 

Since  H  =  I,  this  is  an  unbiased  estimate;  however  it  may  be  suboptimal.  Thus 

Pi  <  M[{x  -  2)(a;  -  xf]  =  U^ViU2  <  U^V2U2  =  P2.  (2.48) 

□ 


3  The  Descriptor  Kalman  Filter 


We  are  now  in  a  position  to  consider  the  recursive  estimation  problem  for  the  system  (1.1)- 
(1.3).  As  we  have  indicated,  we  adopt  here  an  ML  perspective,  viewing  the  dynamics  (1.1) 
and  prior  density  on  a;(0)  as  additional  measurements.  Specifically,  we  describe  in  this 
section  the  recursive  computation  of  the  filtered  estimate  x{j)  which  we  define  as  the  ML 
estimate  of  x(J)  based  on  the  “measurements”  in  (1.1)  and  (1.2)  for  fe  =  0,  ...,j  —  1  together 
with  the  “measurements”  provided  by  the  prior  information  about  a:(0): 


a:o  =  a;(0)  -f  1/ 


(3.1) 


where  n  is  zero-mean,  Gaussian  and  independent  of  u(k)  and  r(k),  and  with  covariance  Pq. 
Specifically,  (3.1)  and  (1.1),  (1.2)  for  k  =  0,...,J  -  1  provide  us  with  a  set  measurements 
of  the  unknown  vector  [®^(0),  ®^(1), ...,  *^(i)].  Examining  this  set  of  measurements  we  see 
that  the  only  terms  in  these  equations  involving  x(j)  are  of  the  form  Ejx(j)  and  Cjx{j). 


Thus  a  necessary  condition  for  x{j)  to  be  estimable  is  that 


have  full  colunm  rank. 


By  induction,  using  the  recursive  ML  estimation  procedure  outlined  in  Section  2,  and  the 
fact  that  a:(0)  is  estimable  from  (3.1),  we  can  show  this  is  also  a  sufficient  condition  for 
estimability  and,  in  fact,  we  can  establish  the  following: 


Lemma  3.1  Let  Pj  denote  the  error  covariance  associated  with  the  filtered  estimate  x{j), 
with  ®(0)  =  xq  and  Pq  given  by  the  prior  distribution  for  a!(0).  Then  x{j  1)  and  Pj+1 

are  respectively  equal  to  the  ML  estimate  of  x{j  -f  1)  and  its  associated  estimation  error 

covariance  based  on  the  following  observations 

y{j  +  1)  =  Cj+ix{j  -I- 1)  +  r{j)  (3.2) 

Aj^j)  =  Ej+ix{j  +  1)  +  Ajv{j)  -  u{j)  (3.3) 
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where  i/(j)  is  a  Gaussian  random  vector,  independent  ofr[j),  with  zero  mean  and  variance 

Pi- 


Applying  Lemma  2.1  to  (3.2),  (3.3)  provides  us  with  the  3-block  form  of  the  descriptor 
Kahnan  filter  smmnarized  in  the  foUowmg: 


Theorem  3.1  The  filtered  estimate  x{j  -f  1)  and  the  corresponding  error  variance  P,+i  can 
be  obtained  from  the  following  recursions: 


®(i  +  1)  = 


■Pj+i  = 


.  1 

/  AjPjAj  +  Qi 

-Si 

Pi+i 

/  AjX{j)  ^ 

o 

o 

-Sf 

Ri 

2/(i  +  i) 

'  1 

0  y 

\  0  ; 

-  j 

/  AjP^Aj  +  Qj 

Pi+1 

\ 

t 

'  0  ^ 

0  J 

-Sf 

Ri 

Chi 

0 

'  ] 

V 

0 

J 

\n 

(3.4) 


(3.5) 


If  past  and  present  observations  and  dynamics  do  not  supply  redundant  perfect  informations, 
i.e.  when 

(AjPjAj  +  Qj  Sj  EJ+^\ 

\  Sf  Rj  Cj+r  j 

has  full  row  rank,  then  the  pseudo-inverse  in  (3.4),  (3.5)  is,  in  fact,  an  inverse. 


Equation  (3.4)  is  the  3-block  form  of  the  optimal  filter,  with  the  Hamiltonian  matrix 
being  the  3-block  matrix  whose  pseudo-inverse  is  taken  on  the  right-hand  side.  Van  Dooren 
[32]  introduced  a  similar  Hamiltonian  form  in  the  case  in  which  E  =  I  and  R  is  invertible 
(so  the  Hamiltonian  matrix  is  certainly  invertible)  and,  by  a  limiting  argument  suggested 
that  it  was  edso  valid  for  R  singular.  Whittle  [34],  Chapter  11  essentially  shows  this  using 
a  dual  optimization  approach  as  in  (2.7)-(2.9)  but  applied  to  the  dynamic  problem  directly. 
Arnold  and  Laub  [3]  also  consider  such  a  form  for  R  invertible  and  E  ^  I  but  invertible. 
Perhaps  closer  to  our  work  is  that  of  Mehrmaim  [22]  who  considers  a  related  Hamiltonian 
pencil  (which  we  encounter  in  Section  5)  for  the  case  of  E  and  R  singular  in  the  context  of 
an  optimal  control  problem  for  singular  systems.  However,  no  recursion  of  the  type  (3.4)  is 
presented  in  [22]  and,  even  more  importantly,  the  3-block  descriptor  Riccati  equation  (3.5) 
is  not  obtained  in  any  of  these  references.  For  completeness,  we  note  that  Bender  and  Laub 
[4,  5]  do  introduce  a  Riccati  equation  for  the  case  of  E  singtilar  in  the  context  of  an  optimal 
control  problem.  However  theirs  is  a  reduced-order  Riccati  equation  of  size  equal  to  the 
rank  of  E,  rather  than  the  full-order  Riccati  equation  (3.5)®. 

Finally,  we  remark  that  the  previously  derived  results  referred  to  in  the  introduction  are 
indeed  specializations  of  (3.4),  (3.5).  In  particular,  Wang  and  Bernhard  [33]  examine  the 

{  E  \ 

case  of  time-invariant  systems,  where  it  is  assiuned  not  only  that  j  ^  1  has  full  column 


^For  (liis  reason  we  conjecture  that  there  is  a  connectioji  between  the  estimation  rlual  of  Bender  and  Laid) 
[4,  5]  and  the  work  in  [.33]  in  which  the  focus  is  on  estimating  Ex{k}.  This  is  only  of  tangential  interest  here 
and  thus  is  not  pursued. 
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rank,  but  also  that  E  Q  J  has  full  row  rank,  5  =  0,  and  M  is  invertible.  Their  analysis 
can  be  interpreted  as  estimating  Ex(j)  —  Ax{j  —  1)  +  u{j  —  1)  based  on  observations  y{k) 
for  0  <  k  <  j  —  1.  In  this  case,  they  obtain  the  following  Riccati  equation  for  the  error 
covariance  "Ej  of  this  one-step  predicted  estimate: 

=  (  0  a)  Mr'  (  )  Mr*  (  jr  )  +  0  (3.6) 

with 

Mj  = 

To  see  how  (3.6)-(3.7)  can  be  obtained  from  (3.5),  note  that 

Ej  =  APjA'^  -t-  Q  (3.8) 


Ej  E 
ET  -C^R-^C 


(3.7) 


and  thus  using  (3.5)  and  the  fact  that  in  Wang  and  Bernhard’s  case  past  and  present 
observations  and  dynamics  do  not  supply  redundant  perfect  information,  we  obtain  the 
following  recursion  for  Ey. 

I  0 


■‘3+1 


=-(o  0  a) 


0 

A^ 


+  Q- 


ROC 
0  Ej  E 
E'^  0 

Then  the  use  of  standard  block-matrix  inversion  results  allows  us  to  express  (3.9)  as 


(3.9) 


=  -(0  ^)Mr>^ 


(3.10) 


where  Mj  is  defined  as  in  (3.7).  It  turns  out  that  (3.10)  is  a  simplified  version  of  (3.6).  To 
obtain  (3.10)  from  (3.6),  note  that 


-'i+i 


(« 

|Mr'( 

(« 

IMrl  , 

/ 

(o 

(o 

IMr'( 

(o 

IMri( 

(« 

Mr>( 

0 

at 


+  Q 


Ma 


0  E 
ET  2CTr-^C 


Mi-1 


0 

at 


+  Q 


Mri 


0 

AT 


0  0  \  nrf-i 

ET  CTR-^C 


0 

AT 


+  Q 


0 

AT 


0  0 
0  I 


0 

AT 


+  Q 


(O  j+C. 


(3.11) 
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If  in  addition,  Sj  is  invertible,  we  can  express  the  inverse  of  Mj  in  (3.10)  in  terms  of  its 
block  entries,  so  that 


Sj+i  =  A{E^^J^E  +  C^R-^C)-^A'^  +  Q,  (3.12) 

which  is  the  Riccati  equation  that  we  had  considered  earlier  in  [28].  In  the  nondescriptor 
case  {E  =  I),  (3.12)  reduces  to  the  standard  Riccati  equation  of  Kalman  filtering 

Sj+i  =  ^(ST^  +  C'^R-^C)-^A^  +  Q.  (3.13) 

4  Stability  and  Convergence  of  the  Descriptor  Kalman  Fil¬ 
ter 

In  this  section,  we  study  the  asymptotic  properties  of  the  descriptor  Kahnan  filter  in  the 
time-invariant  case: 


Ex{k  +1)  =  Ax{k)  +  u(k),  k  >  0 
y(k -h  1)  —  Cx(k  +  1)  +  r(k),  k>0 


(4.1) 

(4.2) 


where  matrices  E  and  A  are  1  x  n,  C  is  p  x  n,  and  u  and  r  are  zero-mean,  white,  Gaussian 
sequences  with  covariance 


M 


u(k) 

r(k) 


«(i)  Y 

r{j)  ) 


Q  S 
R 


6{k-j). 


(4.3) 


In  particular  the  results  presented  in  this  section  generalize  those  in  [23,  28,  33]  for  descriptor 
systems  and  the  usual  results  for  standard  causal  systems.  Note  that  as  in  [33]  in  our 
development  we  do  not  reqmre  that  I  =  n,  so  that  (4.1)  need  not  be  square  and  even  if  it 
is,  we  do  not  require  {E,  A}  to  define  a  regular  pencil.  In  the  context  of  viewing  (4.1)  as 
simply  providing  another  source  of  measurements,  we  see  that  it  is  quite  natural  to  remove 
this  restriction.  Also,  as  before,  we  allow  the  possibility  that  R  is  singular  as  well. 


Definition  4.1  The  system  (4-l)-(4-2)  is  called  detectable  if 


has  full 

column  rank  for  all  (s,t)  ^  (0,0)  such  that 

M> 

It  is 

called  stahilizable  if 

(  sE  -tA  Q 

-s 

sC  -S^ 

R 

has  full 

row  rank  for  all  (5,t)  7^  (0,0)  such  that  jsj 

>  |i|. 
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These  definitions  generalize  other  similar  definitions  in  the  literature  [7,  22].  Note,  for 
example,  that  these  definitions  reduce  to  the  classical  notions  of  detectability  and  stabiliz- 
ability  when  E  =  I  and  R  >  0.^ 

The  following  generalizes  the  usual  result  that  detectability  implies  the  existence  of  a 
stable  observer.  An  important  point  to  note  here,  however,  is  that  our  observer  (4.4)  is  an 
explicit  causal  system,  even  if  (4.1),  (4.2)  is  implicit,  a  direct  consequence  of  estimability. 

Theorem  4.1  Let  be  detectable,  then  there  exists  a  stable  filter 

x,{k  +  1)  -  A3X3{k)  +  Ksy{k  +  1),  x,{0)  =  xq  (4.4) 

i.e.  such  that  A,  is  a  stable  matrix  (all  eigenvalues  have  magnitude  less  than  1)  and  with 

lim  M[{x{k)  —  a:,(fc))(a:(fc)  -  a;,(A;))^]  <  oo.  (4.5) 


Proof  of  Theorem  4.1;  We  start  the  proof  by  showing  the  following  lemma: 


Lemma  4.1  Let  (4-l)-(4-2)  be  detectable.  Then  there  exists  a  left  inverse  {Lg 


LgE  +  LgC  —  I, 


such  that  LgA  is  stable. 


Lc)  of 


(4.6) 


Proof  of  Lemma  4.1:  First  note  that  the  lemma  is  trivially  true  if  C  has  full  column 
rank,  since  we  can  simply  take  Lg  =  0  and  Lg  =  a  left-inverse  of  C.  Assuming  this  is  not 
(  E  \ 

the  case  but  that  I  I  has  full  rank,  we  can  find  invertible  I  x  I  and  n  x  n  matrices  U 
and  V  such  that 

CV  =  (  0  Cz  )  (4.8) 

where  the  partitions  in  (4.7),  (4.8)  are  compatible,  I  denotes  a  square  identity  matrix,  and 
C2  has  full  column  rank.  If  we  partition  U AV  similarly  as 


UAV  = 


An  Ai2 
A21  A22 


(4.9) 


the  detectability  of  {C,  E,  A)  implies  that 


{slt)I  -  An 
— A21 


■*111  this  case  our  detectability  condition  reduces  to 


si-  A 
C 


having  full  column  rank  for  all  |s|  >  1. 


Also,  with  iZ  >  0  and  since  (4,3)  imjilies  that  Q  =  BB"^  and  S  =  BD^  for  some  matrices  B  and  D, 
stabilizability  in  this  case  corresponds  to  si  —  A  B  )  having  full  row  rank  for  |s|  >  1. 
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has  full  column  rank  for  |s/tj  >  1,  which  means  that  (An,  -A21)  is  detectable  in  the  usual 
sense.  Thus,  there  exists  a  matrix  D  such  that  An  +  DA21  is  stable.  Next,  let  F  be  any 
matrix  satisfying 

^  -DE22  \ 


FC2 


For  example,  we  can  take 


-DE22C^ 


where  C'l'  is  any  left-inverse  of  C2.  Then 


I  D 
0  0 


I  0 
0  E22 


and 


I  D 


111  -^12 


^  -k  F  (  0  02)=! 


An  -f-  DA21  Ai2  +  DA22 
0  0 


y  0  0  /  \  ^21  A22  )  y 

which  is  stable  because  An  -f-  DA\2  is  stable.  Thus  by  taking 

/  \ 

I  D 


Le 


V 

\ 

VF 


0  0 


u 


the  lemma  is  proved. 


(4.10) 

(4.11) 

(4.12) 

(4.13) 

(4.14a) 

(4.14b) 

□ 


Continuing  the  proof  of  the  theorem,  note  that  using  the  above  lemma,  we  can  express 
x{k  -|- 1)  as 

x{k  -f- 1)  =  LeAx{k)  4-  Lcy{k  +  1)  +  L^u{k)  -  Lcr{k) 


where  i-A  is  stable.  If  we  now  define 


we  can  easily  see  that 


Xg{k  +  1)  =  LeAx,{k)  +  LcVik  +  1) 


lim  M[{x{k)  -  x,{k)){x{k)  -  a:,(fc))^]  =  P, 

fc— >00 


(4.15) 

(4.16) 

(4.17) 


where  P*  is  the  unique  positive  semi-definite  solution  of  the  Lyapimov  equation 


P,  -  (P,A)P,(ieA)^  =  (  Xe  Pc  )  (^  _^r  if  )  (  )  •  (4.18) 


The  theorem  is  thus  proved. 


□ 


Detectability  alone,  of  course,  does  not  guarantee  that  the  descriptor  Kalman  filter 
converges  to  a  stable  filter.  However,  as  would  also  be  expected  from  what  we  know  for 
causal  systems,  detectability  does  tell  us  something  about  the  descriptor  algebraic  Riccati 
equation. 
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Theorem  4.2  Let  (4.1)-(4-2)  be  detectable.  Then  the  algebraic  descriptor  Riccati  equation 


F=-(0  0  /) 


/  APA^  AQ  -S  F  \  ^ 


-5^ 

ET 


R  C 
C'^  0 


/  0 
0 


(4.19) 


has  a  positive  semi-definite  solution. 


Proof:  We  prove  the  existence  of  a  positive  senii-definite  solution  P  to  (4.19)  by  showing 
that  the  descriptor  Riccati  recursion 


P,+i  =  -(0  0  /) 


/  APjA'^  +  Q  -S  E\^  /  Q 
-S^  R  C  \  0 

\  ET  Q  I  \I 


(4.20) 


with  Po  =  0  is  monotone  increasing  and  bounded.  This  of  course  implies  the  convergence  of 
Pfe,  which,  from  (4.20)  implies  that  this  limit,  which  must  be  positive  semi- definite,  satisfies 
(4.19).® 

To  see  the  boundedness  of  Pfe,  consider  the  stable  filter  (4.16)  with  ajj,(0)  =  xq.  It  is 
then  clear  that  the  associated  error  variance  matrices  Ps(A:)  converge  asymptotically  to  Pj 
the  rniique  solution  of  (4.18),  and  that  thanks  to  the  optimality  of  the  descriptor  Kalman 
filter,  Pfe  <  Ps{k). 

We  show  that  Pfe  is  monotone  increasing  by  induction.  Clearly 


Pi  >  Po  =  0. 


Now  suppose  that 


Pi  >  P 


i-i- 


Pj  is  the  estimation  error  covariance  associated  with  estimating  x{j)  based  on 
(  Ax{j  -l)  \  _  (  e\  ...  ,  (  Au{j  -  1)  -  u{j  -1)\ 


\  yU)  + 

where  the  covariance  of  ( 


IS 


r{j  -  1)  J 

APj.iA^  +  Q  -S 


r(y-l)  J-\  -S^  R  y 

is  the  estimation  error  covariance  associated  with  estimating  x{j  +  1)  based  on 


Ax{j) 

y{i  + 1) 


E 


x{j  +  1)  -I- 


Au(j)  -  u{j) 
r{j) 


(4.21) 

(4.22) 

(4.23) 
Also  Pj+i 

(4.24) 


^The  usual  argument  here  involves  a  right-hand  side  which  includes  matrix  in^'erses  for  which  we  can 
deduce  convergence  by  the  continuity  of  the  inversion  operation.  While  the  full  generalized  inverse  in  (4.20) 

f  E  \ 

is  not  unicpie,  the  lower  right-hand  block  is,  due  to  the  fact  that  I  ^  full  column  rank.  Intleed  in 

/  >?  \ 

with  H  in 


Appendix  A  we  give  an  explicit  form  for  this  block  which  involves  true  inverses  (idenlifv  ,  ^ 

\  ^ 

Appendix  A). 
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,  ,  •  e  (  ~  ^(j)  ^  •  [  APjA^  +  Q 

where  the  covariance  of  I  r{j)  /  i  S'^ 

see  that 

(  AP^A^  +  Q  -S],(  APj.a'^  +  Q 

\  -S^  R  j  -\  -S^ 

and  thanks  to  Lemma  2.5,  we  conclude  that  Pj^i  >  Pj. 


But  from  (4.22)  we 

(4.25) 

□ 


In  what  follows  it  is  useful  to  have  available  an  alternate  form  for  the  Kalman  filter 
equation  (3.4),  (3.5)  in  the  time-invariant  case.  Obtaining  this  requires  the  following  which 
is  proved  in  Appendix  A: 


Lemma  4.2  Let  R  be  a  positive  semi-definite  matrix  and  H  a  full  column  rank  matrix. 
Then 


and 


(4.27) 


Using  this  result,  it  is  straightforward  to  verify  that  the  Riccati  recursion  (4.20)  can  be 
rewritten  as 


=  {L^A)Pi{LiAf  -f  (  i,-  ^  Jr 


(4.28) 


with 


APjA'^  +  Q  -S  E\‘^ 

[ij  K j)  =  (  0  0  J  )  I  -5^  R  C 

CT  0 


/JO 
0  I 
^  0  0 


(4.29) 


Thus  we  have  that  the  descriptor  Kalman  filter  (3.4)  takes  the  form 

x{k  -I- 1)  =  LkAx{k)  4-  Kky{k  4- 1). 


(4.30) 


Via  similar  manipulations  we  can  also  rewrite  the  algebraic  Riccati  equation  (4.19)  as 


P  =  ^LA)P{LAf  +  (  i  if  )  ^  Jr  /  ) 


L'^ 


where 


(  J  K  )  =  (  0  0  J  ) 


/  APA^  4-  Q 

-S 

E  \ 

Vj 

0  \ 

-S^ 

R 

c 

0 

I 

\  E^ 

(yT 

0  } 

\  0 

0/ 

(4.31) 


(4.32) 
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In  the  following  we  consider  the  behavior  of  the  Kalman  filter  and  Riccati  equation 
when  the  system  is  both  detectable  and  stabUizable,  obtaining  a  generalization  of  well- 
known  results  for  causal  systems.  Note  that  stabilizability  implies  that 

(  E  Q  S] 

\  C  R  ) 

has  full  row  rank,  which  in  turn  implies  that 

(  E  APjA^  +  Q  S  \ 

\  C  R  ) 

has  full  row  rank  for  all  Pj  >  0,  so  that  in  this  case  the  pseudo-inverses  in  (4.29),  (4.32)  are 
in  fact  inverses. 

Theorem  4.3  Suppose  that  (4-l)-(4.2)  is  detectable  and  stabUizable.  Then  for  any  initial 
condition  Pq,  the  solution  Pk  of  the  Riccati  equation 

Pfe+i  =  {LkA)Pk{LkAf  -f  (  ifc  Kfe  )  )  (  ^1) 

/  APkA^  +  Q  -S  E\^  /  I  0  \ 

(  ife  iiTfe  )  =  (  0  0  J  )  -52^  R  CO/  (4.34) 

V  et  0  I  \  o  0  ) 

converges  exponentially  fast  to  the  unique  positive  semi-definite  sonlution  of  the  algebraic 
descriptor  Riccati  equation 

P  =  (LA)P{LAf  +  (  i  )  (  Jr  (4.36) 

/  APA^  +  0  -S  E  /  1  0\ 

(  /  /:  )  =  (  0  0  I  )  -5^  R  CO/.  (4.36) 

\  E^  c^o/voo/ 

Furthermore  the  steady-state  Kalman  filter 

x{k  Pl)  =  LAx{k) Ky{k +  1)  (4.37) 

is  stable. 

Proof  tProm  Theorem  4.2  we  know  that  there  is  at  least  one  positive  semi-definite  solution 
to  (4.35),  (4.36).  What  we  would  like  to  show  is  that  this  positive  semi-definite  solution  is 
unique,  that  Pj  in  (4.33,4.34)  converges  to  P  exponentially  fast  for  any  initial  condition  Po? 
and  that  the  resulting  steady-state  Kalman  filter  (4.37)  is  exponentially  stable,  i.e.  that 
LA  is  a  stable  matrix.  Note  that  by  using  (4.27),  it  is  not  difficult  to  show  that 

LE  =  I  -  KC.  (4.38) 
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Thus  premultiplying  (4.1)  by  L,  using  (4.2)  and  (4.37),  and  defining  x{k)  =  x{k)  -  x{k),  we 
see  that 

x{k  +  1)  =  LAx{k)  +  Lu{k)  —  Kr{k)  (4.39) 

so  that  this  will  imply  the  stability  of  the  error  dynamics. 

Let  us  first  show  that  LA  is  stable  when  P  is  taken  as  any  positive  semi-definite  solution 
of  (4.31).  Suppose  LA  is  not  stable.  Then  there  exist  a  complex  number  A  and  a  complex 
row  vector  v  such  that  iA|  >  1  and 

vLA  =  Xv.  (4.40) 

iFrom  (4.31)  we  get  that 

(1  -  =  v(l 


where  (.)^  denotes  the  conjugate-transpose.  Since  the  right  hand  side  of  (4.41)  is  non¬ 
negative  and  its  left  hand  side,  non-positive,  we  must  have 


But  from  (4.38)  and  (4.40)  we  have 

XvLE  =  vLA  -  XvKC. 
iFrom  (4.42)  and  (4.43)  it  follows  that 


(  XE- A  Q  S  \ 
^  ^  (  XC  R  ) 


(4.42) 


(4.43) 


(4.44) 


which  since  v{L  K)  ^  0  ((4.40)  implies  vL  ^  0)  contradicts  the  stabilizability  assmnption. 
Thus  LA  is  stable. 

We  can  next  show  that  there  exists  a  unique  positive  semi-definite  solution  of  (4.31). 
Specifically,  suppose  that  and  P^  are  two  such  solutions,  and  let  [L^  K^] 

denote  the  corresponding  matrices  in  (4.32).  Then  L^A  and  L^A  both  are  stable,  and,  as 


shown  in  Appendix  B 

P^  -  p2  =  {L^A){P^ 

-P^){L^Af, 

(4.45) 

so  that  iterating  (4.45) 

P^  -  p2  =  lim  {L^A)’^{P^ 

k—*oo 

-  P^)[{L^Af]’^  =  0. 

(4.46) 

Finally  we  can  show  that  Pj  converges  to  P  exponentially  fast  for  any  initial  condition 
Pq.  First  note  that  Pj*  <  Pj  where  Pj  is  the  error  covariance  for  the  problem  starting 
from  Po  =  0.  We  already  know  that  Pf  — >  P.  Thus  if  we  can  find  a  sequence  Wj  so  that 
Pj  <  Wj  and  Wj  ^  P  exponentially  fast,  we  will  be  finished.  We  accomplish  this  by  letting 
Wj  be  the  error  covariance  of  the  estimator  defined  by  the  steady-state  filter  (4.37)  for  all 
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k,  starting  with  the  same  initial  estimate  as  the  optimal  Kahnan  filter.  Thus  Wq  —  Pq  and 
Wj  >  Pj  for  all  j  >  0.  Furthermore  from  (4.39) 

Wj+i  =  (LA)Wi(lAf  +  {L  K)(  Ji.  ~/)(^)  (4.47) 

which,  thanks  to  the  stability  of  LA,  converges  exponentially  fast  to  the  tmique  positive 
semi- definite  solution  of 

W  =  (lA)W(lAf  +  (l  K)(^  "/  )  (  )  (4.48) 

and  a  comparison  to  (4.31)  yields  that  W  =  P,  completing  the  proof.  □ 

Again  we  note  that  the  results  of  this  section  represent  a  generalization  of  those  in 
[28,  33].  Furthermore  they  represent  what  is,  to  our  knowledge,  a  new  derivation  for  the 
more  frequently  studied  singular  estimation  problem  {E  =  I,  R  singular). 


5  Construction  of  the  Steady  State  Filter 


/  E 

-Q 

S  ] 

/A 

0 

0  \] 

c 

gT 

-R 

,  0 

0 

0 

Uo 

A^ 

0  J 

1  0 

F^ 

)l 

In  this  section,  we  show  that  the  solution  of  the  algebraic  descriptor  Riccati  equation  can 
be  constructed  by  using  the  eigenvectors  and  generalized  eigenvectors  of  the  pencil: 


(5.1) 


A  similar  pencil  was  also  introduced  in  [22]  for  the  study  of  the  LQ  control  problem  for 
descriptor  systems,  although  3-block  Riccati  equations  are  neither  introduced  nor  studied. 

We  shall  assume  throughout  this  section  that  the  system  is  detectable  and  stabilizable. 
The  results  we  present  here  generalize  the  usual  results  [17,  31,  32]  for  standard  causal 
{E  =  I)  systems  for  the  case  of  singular  measmrement  noise,  and  our  results  of  [23,  28]  to 
the  case  where  R  may  be  singular  and  in  addition  {E,A}  need  not  be  regular  nor  even 
square.  Before  beginning,  let  us  introduce  the  following  notation: 


K 


,  (?  = 


Q  -S 
-S^  R 


The  pencil  (5.1)  can  now  be  expressed  as 


F  -G  \  IK  0 

0  I  7  I  0 


and  the  descriptor  Riccati  equation  as 
P 

We  begin  with  the  following 


-(0  /)( 


KPK^  +  G  F 
0 


(5.2) 


(5.3) 


(5.4) 
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Lemma  6.1  The  pencil  (5.1)  is  regular  and  has  no  eigenmode  on  the  unit  circle. 

Proof:  All  we  need  to  show  is  that  for  all  z  on  the  unit  circle, 

(o 

is  invertible.  Note  that  thanks  to  the  detectability  assumption  which  can  now  be  stated  in 
terms  of  the  new  notation  as:  “sF  -  tK  has  full  column  rank  for  (s,  t)  ^  (0, 0)  and  i^l  >  ii|”, 
we  can  see  that  F  +  zK  has  full  column  rank  for  aU  z  on  the  unit  circle.  Now  suppose  that 
(5.5)  is  not  invertible,  which  means  that  there  exist  u  and  v  not  simrdtaneously  null  such 
that 

If  we  now  let 

T  =  zK  +  F,  (5.7) 

from  (5.6)  it  follows  that 

Tu-Gv  =  0  (5.8) 

T^v  =  0.  (5.9) 

If  we  now  multiply  (5.8)  and  (5.9)  on  the  left  by  and  respectively  and  take  the 
transpose-conjugate  of  (5.9)  and  subtract  from  (5.8),  we  get 

vGv^  =  0  (5.10) 

which  since  G  is  symmetric  positive  semi- definite  implies  that  Gv  =  0.  Thus  since  F  has 
full  column  rank,  (5.8)  implies  that  «  =  0.  But  we  also  have  that  v^{T  G?)  =  0  which 
thanks  to  the  stabilizability  assumption  implies  u  =  0,  contradicting  the  assmnption  that  u 
and  V  are  not  simultaneously  null.  □ 


Lemma  6.2  The  pencil  (5.1)  has  exactly  n  stable  eigenmodes. 

Proof:  Let 


.  .  F  -G  \  K  0 

p(5,i)  =  det  5  I  „  r.T 


0  KT 

/  \ 

,  ,  ,  ,0  K^\  ^  F^ 


Then 

p(t,  s)  =  det 
From 


0  \  ,  (  0  F^ 

F  -G  \  K  0 


Q  F^  ] 
I  ^  =  det 

1  =  det  i 


0  sK'^  +  tF'^ 
sF  -I-  tK  -sG 


0  tK'^  +  sF^ 


tF  +  sK 


-tG 


.(5.11) 


.  (5.12) 


J  0  \  /  0  sK'^  +  tF'^  \  (  in  0 

0  tl  )\  sF^tK  -sG  /  I  0  J 
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//  OW  0  tK^ sF'^  V  (  Ifs  0\ 

\0  si  )\tF  +  sK  -tG  j  \  0  J  j  ’ 

we  find 

^  s^+Pp{t,s)3-^, 

so  that 

t^+P-^p{s,t)  =  s^+P-^p{t,s). 


(5.13) 

(5.14) 

(5.15) 


If  we  denote  the  number  of  zero  eigenmodes  by  ^oi  stable  but  non-zero  eigenmodes  by  Sg, 
imstable  eigenmodes  by  and  infinite  eigenmodes  by  S^o,  from  (5.15)  and  the  fact  that 
there  are  no  eigemnodes  on  the  unit  circle,  we  conclude  that 


Sg  =  6y, 

^oo  -  ^0  =  /  -f  p  -  n. 

Finally  noting  that 

^0  F  ^g  Su  F  ^oo  =  n  -f-  /  -j-  p 

we  get  that  the  number  of  stable  eigenmodes 


(5.16a) 

(5.16b) 

(5.17) 

□ 


Theorem  5.1  Let  the  columns  of 


form  a  basis  for  the  eigenspace  of  the  pencil  (5.1)  associated  with  its  n  stable  eigenmodes, 

i.e. 


(5.18) 


where  J  is  stable.  Then,  P,  the  unique  positive  semi-definite  solution  of  the  algebraic  Riccati 
equation  (^.19)  is  given  by 

P  =  X{E^Yi  4-  C'^Y2)-\  (5.19) 


Proof:  Using  notation  (5.2)  and  letting 


we  must  show  that 

P  =  X{FY)-K 

To  construct  a  real  basis 


(5.20) 

(5.21) 
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and  a  real  stable  matrix  J  satisfying  (5.18),  we  need  only  to  compute  the  generalized  real 
Schur  decomposition  {[15],  p.  396) 


F 

0 

-G  ^ 
RT  j 

=  M 

K 

0 

0  ) 
ft 

1  ^ 

=  N 

(5.22a) 

(5.22b) 


of  the  pencil  (5.1),  where  Q  and  Z  are  orthogonal  matrices,  M  is  quasi-upper-triangular, 
N  is  upper  triangular,  and  where  the  n  X  n  blocks  and  iV,  in  the  decomposition 


M  = 


M,  M,s 

0  Ms 


N  = 


iV,  iV,s  \ 
0  Ns  ) 


(5.23) 


correspond  to  the  stable  eigenmodes  of  the  pencil  (5.1),  i.e.  J  =  is  stable.  Then,  if 

Z,  is  the  matrix  formed  by  the  first  n  columns  of  we  have 


iFrom  (5.18)  we  have 

FX-GY  =  KXJ 
rTy  =  F'^YJ. 


(5.24) 


(5.25) 

(5.26) 


Premultiplying  (5.25)  by  and  taking  into  account  the  transpose  of  (5.26),  we  find  that 
Y'^FX  satisfies  the  Lyapunov  equation 


Y'^FX  =  Y'^GY  +  J^Y^FXJ.  (5.27) 

Let  us  show  that  F^Y  is  invertible.  Suppose  that  F^Y  is  not  invertible,  so  that  there  exists 
w  ^  0  such  that  F'^Yw  =  0.  Then,  from  (5.27)  we  see  that 


GYw  ^  0  (5.28) 

so  that 

w'^Y^  (  jP  G  )  =  0.  (5.29) 

But  the  stabilizability  assmnption  implies  that  ^  F  G  ^  has  full  row  rank,  so  that 

Yw  =:  0.  (5.30) 

Multiplying  (5.26)  on  the  right  by  w  and  using  (5.30)  we  see  that 

F^YJw  =  0.  (5.31) 

Thus,  we  have  shown  that  the  right  null  space  of  F'^Y  is  /-invariant.  This  implies  that 
there  exists  an  eigenvector  w  7^  0  of  /  in  the  right  null  space  of  F^Y,  i.e. 

F^Yw  —  0,  Jw  =  Xw.  (5.32) 
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Multiplying  (5.25)  on  the  right  by  w  and  taking  into  account  (5.32)  gives 


(jP  -  XK)Xw  =  0. 


(5.33) 


Since  J  is  stable,  |A|  <  1.  The  detectability  assumption  then  implies  that  F  —  \K  has  full 
colunm  rank,  so  that 

Xw  =  Q.  (5.34) 

Combining  (5.30)  and  (5.34)  yields 


X 


tu  =  0 


(5.35) 


and  since  y  J  column  rank,  we  must  have  to  =  0.  This  is  a  contradiction,  so 

that  F^Y  must  be  invertible. 

Now,  if  we  solve  for  J  in  (5.26)  and  substitute  it  in  (5.25)  we  obtain 


FX  =  [G  +  KX{F'^Y)-^K'^]Y 


from  which  we  get 


G  +  KX{F'^Y)-^K'^  f  \  (  Y  \  _  (  0 

FT  0  /  I  -X  ~ 


(5.36) 


(5.37) 


This  implies 


X 


Tvwi  -(  KX{FTy)-^KT  F 


-1 


(F^Y)-^  = 


ft 


0 

I 


(5.38) 


so  that  if  P  =  X(P^F)“^,  we  have 


i>  =  -(o  j)( 


-1 


G  +  KPRT  P  \  ‘  /  0 
FT  0  /  \  J 


(5.39) 


i.e.  P  satisfies  the  algebraic  descriptor  Riccati  equation  (4.19).  Since  YT FX  solves  the 
Lyapunov  equation  (5.39),  it  is  positive  semi-definite,  so  that 


P  =  (FTY)-T{yTfX){FTy)-^ 


is  also  positive  semi-definite. 


(5.40) 

□ 


6  An  Adjusted  Estimate  to  Account  for  “Future”  Dynam¬ 
ics 

The  estimation  problem  we  have  considered  in  the  preceding  sections  involved  the  recursive 
computation  of  estimates  of  a:(fe)  based  on  dynamics  and  observations  only  in  the  past  and 
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present.  As  we  have  pointed  out,  descriptor  dynamics  allow  the  possibility  of  noncausal 
behavior  and  thus  it  edso  wotdd  seem  reasonable  to  consider  the  recursive  computation 
of  estimates  that  incorporate  futme  dynamics.  Specifically  suppose  that  we  now  define 
the  estimate  x{j)  as  the  ML  estimate  of  x{j)  based  on  the  true  measurements  (1.2)  for 
=  0, ...,  j  -  1,  the  “measurement”  (3.1)  provided  by  the  prior  information  about  a:(0),  and 
the  dynamics  (1.1)  for  all  k,  as  opposed  to  0  <  ^  <  j  -  1  as  we  did  previously.  In  the  usual 
causal  case,  i.e.  =  Uk  =  n,  E  =  I,  the  inclusion  of  these  “future”  dynamics  provide  no 
additional  information  about  x{j),  as  they  provide  no  constraints  on  x{j).  That  is,  consider 

0  =  Ej+ix{j  +  1)  -  Ajx{j)  -  u{j).  (6.1) 

If  Ej^i  =  I  (or  more  generally,  if  it  is  surjective)  then  since  x{j  +  1)  is  completely  unknown 
in  the  ML  formulation,  (6.1)  provides  no  constraint  on  x{j).  However  if  Ej^i  is  singular, 
(6.1)  does  provide  nontrivial  information  about  x{j)  (e.g.  consider  the  extreme  case  of 
Ej+i  =  0).  In  general,  of  course,  the  situation  is  even  more  complex,  since  x{j  +  1)  may 
also  be  subject  to  constraints  due  to  dynamics  farther  into  the  future.  In  the  general  time- 
varying  case  there  is  no  bound  on  how  far  into  the  future  one  must  look  in  order  to  capture 
aU  possible  dynamics.  In  particular  in  such  a  case  what  we  would  need  to  do  at  each  time  is 
to  filter  backward  the  “measurements”  corresponding  to  future  dynamics  in  order  to  obtain 
the  correct  adjustment  to  the  forward  filtered  estimate  developed  in  the  previous  sections. 
However  in  the  time-invariant  case,  the  structure  of  the  information  provided  by  the  future 
dynamics,  i.e. 

Ex{k  +  1)  =  Ax{k)  +  u{k)  k>j  (6.2) 

is  independent  of  j.  In  this  section  we  show  that  in  this  case  we  can  replace  the  effect  of 
future  dynamics  with  just  one  observation: 


Fx{j}  =  w{j)  (6.3) 

where  w  is  zero-mean,  and  F  and  the  covariance  F  of  uj  are  time-invariant.  Thus  the 
problem  is  to  find  F  and  V  in  terms  of  E,  A  and  Q  (the  covariance  of  u). 

Rewriting  (6.2)  as  a  matrix  equation  we  obtain 

/  -A 


E 

-A 


E 


-A  E 


\ 


\ 

^  »(i)  ) 

^  Mj)  ^ 

®(j  + 1) 

u(j  +  1) 

x{j  +  2) 

u(j  +  2) 

/ 

V  :  y 

\  :  / 

(6.4) 


The  relation  (6.4)  provides  some  information  not  only  about  x{j),  but  also  about  the  vectors 
x{k)  for  k  >  j,  which  are  not  directly  of  interest  and  can  be  viewed  as  exogenous  variables. 
In  order  to  isolate  the  information  about  x{j)  that  is  contained  in  (6.4),  our  first  step  will 
be  to  bring  (6.4)  to  the  form  (2.39),  so  that  Lemma  2.4  can  be  applied.  This  requires  using 
block  row  manipidations  to  eliminate  the  vectors  x{k)  with  k  >  j  from  as  many  equations 
as  we  can,  thereby  enabling  us  to  drop  the  remaining  measurements.  Specifically,  suppose 
that 

{  -A  E  \ 

-A  E 

-A  E 


To  Ti  T, 


...) 


=  (  -ToA  0  0  ...  )  .  (6.5) 


/ 
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Then,  from  (6.4),  we  get 


-  ToAx{j)  =  ^  Tiu{j  +  i)  (6.6) 

i 

which  is  of  the  form  (6.3).  So  the  problem  becomes  one  of  finding  the  highest  row  rank 
matrix  satisfying  (6.5).  We  can  rewrite  (6.5)  as 

T{z){zE  -  A)  =  -ToA  (6.7) 

where  T{z)  =  To  +  zTi  +  z^T2  +  ...  and  thus  we  need  to  find  the  polynomial  matrix  T{z) 
of  largest  rank  such  that 


T{z){zE  -  A)  =  constant  matrix.  (6.8) 

Let  us  denote  the  unknown  right-hand  side  of  (6.8)  by  T,  and  let  U{z)  and  S  be  respectively 
imimodular  and  permutation  matrices  for  which 


t7(z)(i£-A)5=  ^  "W  (6.9) 

where  N{z)  is  square  and  invertible.  Then,  if  we  denote 

FS  =  (Fi  Fa  )  (6.10a) 

T(z)U-\z)  =  (  Ti(z)  T2(z)  )  (6.10b) 

we  must  have 


which  implies  that 
or  equivalently. 


Ti(z)N(z)  =  Fi 
Ti(z)K(z)  =  F2 

K(z)N-^z)Ft  =  F2 


Constant  solutions  ^  Ti  Ta  ^  to  (6.13)  can  be  constructed  by  noting  that  if 

1  ^  - 


K{z)N-\z) 

-I 


(6.11a) 

(6.11b) 

(6.12) 

(6.13) 

(6.14) 


where  p(z)  is  a  scalar  polynomial  and  ij’s  are  constant  matrices,  then  (6.13)  is  eqtuvalent 
to 

(  -fi  Ta  ^  ^  To  Ti  ...  Lm  ^  =  0.  (6.15) 

Let  ^  jPi  jFa  ^  be  a  highest  rank  solution  to  (6.15).  Let  W  be  the  highest  rank  (full  row 
rank)  matrix  for  which 

WFiN~^{z)  =  polynomial  (6.16) 


28 


Then,  let 

Fi  =  WFi,  F2  -  WF2.  (6.17) 

We  get 

F  =  (Ft  F2)  S-^  (6.18) 

and 

T{z)  =  (  FiiV-i(z)  T2{z)  )  U{z)  (6.19) 

where  T2{z)  is  any  arbitrary  polynomial  matrix.  It  turns  out  that,  without  loss  of  generality, 
we  can  pick  T2{z)  =  0.  This  is  due  to  an  implicit  assmnption  that  was  made  throughout 
this  paper,  namely  that  the  dynamic  equations  are  consistent  for  all  possible  choices  of 
inputs  u{k),  i.e.  no  constraints  on  the  inputs  is  imposed  by  the  dynamic  equations.  It  is 
straightforward  to  verify  that  this  requires 

Left-ker  Q  D  Left-ker  [zE  —  .4],  (6.20) 

which  is  called  the  compatibility  assumption. 

To  see  why  the  compatibility  assumption  implies  that  the  choice  of  T2{z)  does  not 
matter,  simply  note  that  thanks  to  (6.20)  we  have 

(  0  Tiiz)  )  Uiz)Q  =  0.  (6.21) 

Finally,  we  get 

V 

Fx{j)  =  Y,Tiu{j  +  i)  (6.22) 

i— 1 

where  X)f=i  'Fiz'’  =  T{z).  Thus  we  obtain  (6.3)  with 

V  =  j2TiQTf.  (6.23) 

i=l 

Using  (6.3),  we  can  construct  the  “true”  or  “adjusted”  descriptor  Kalman  estimate  by 
correcting  the  result  of  the  Kalman  filter  to  incorporate  this  additional  observation.  In 
particular  using  the  methods  developed  in  the  previous  sections,  we  construct  the  optimal 
estimate  £(j)  of  a:(j)  based  on  past  dynamics  and  observations.  This  “information”  is 
completely  coded  by  the  observation 

®(i)  =  ®(i)  +  i'(j),  (6.24) 

where  t'(j)  is  a  zero-mean  Gaussian  vector  with  covariance  Fj,  where  F  -  j  satisfies  the 
Riccati  equation  described  previously.  If  we  now  add  future  dynamics,  we  have  to  find  the 
optimal  estimate  of  x(j)  based  on  the  observation 
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Since  future  dynamics  are  independent  of  past  dyneunics  and  observations,  the  new  Kalman 
estimate  x{j)  and  the  corresponding  error  covariance  Pj  are  given  by 


and 


/  Pj  0  I 

x{j)  =  (00/)  0  V  F 

\  I  0 


Pj=-{o  0  /) 


PJ 

0 

I 

0 

V 

F 

t  ^ 

I 

ft 

0 

) 

(6.26) 


(6.27) 


7  Conclusions 

In  this  paper  we  have  derived  Kahnan  filtering  recursions  for  a  general  class  of  discrete-time 
descriptor  systems,  where  the  noise  covariances  were  allowed  to  be  singular.  By  using  a 
Hamiltonian  (or  dual)  formulation  of  the  ML  estimation  problem,  the  optimal  filter  and 
the  associated  Riccati  equation  for  the  error  covariance  were  expressed  in  3-block  form.  In 
the  time-invariant  case,  the  asymptotic  behavior  of  the  optimal  filter  was  examined  and 
characterized  in  terms  of  the  corresponding  3-block  algebraic  Riccati  equation.  Finally, 
under  standard  detectability  and  stabilizability  conditions,  it  was  shown  that  the  positive 
semi-definite  solution  of  the  algebraic  Riccati  equation  could  be  obtained  by  constructing 
the  generalized  Schur  form  of  a  3-block  matrix  pencil. 

Although  we  have  focused  primarily  on  descriptor  systems,  it  is  worth  noting  that, 
because  of  the  3-block  forms  we  have  introduced,  otrr  results  already  present  a  nmnber  of 
advantages  over  existing  Kalman  filtering  techniques  for  systems  with  standard  dynamics 
[E  =  I)  but  with  singular  measurement  noise.  For  example,  in  the  absence  of  redvmdant 
perfect  information,  the  3-block  filter  and  Riccati  equations  of  Theorem  3.1  require  only 
standard  matrix  inverses,  whereas  solutions  proposed  imtil  now  require  the  use  of  pseudo¬ 
inverses  (see  [12],  section  7.4). 

One  obvious  direction  in  which  the  results  of  our  paper  can  be  extended  consists  in 
dualizing  om  results  by  considering  the  descriptor  LQ  control  problem.  Preliminary  results 
in  this  direction  appear  in  section  6  of  [30].  Other  interesting  results  for  the  descriptor 
LQ  control  problem  have  been  derived  by  Bernhard,  Grimm  and  Wang  [7],  and  Mehrmann 
[22].  Another  possible  extension  would  involve  considering  the  continuous-time  descriptor 
Kalman  filtering  problem.  Unfortunately,  the  continuous- time  version  of  the  problem  dis¬ 
cussed  here  may  not  be  completely  meaningful.  This  is  due  to  the  fact  that  unlike  the 
discrete-time  case,  where  the  singularity  of  the  system  dynamics  gives  rise  to  a  noncausal 
impulse  response,  for  continous-time  systems  the  singularity  manifests  itself  by  the  fact  that 
the  output  contains  derivatives  of  the  system  input.  Since  for  the  fitering  problem  the  input 
is  a  white  Gaussian  noise,  the  output  will  contain  white-noise  derivatives,  thereby  necessi¬ 
tating  a  formtdation  of  the  filtering  problem  in  terms  of  generalized  stochastic  processes. 
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Appendix  A:  Some  Results  on  Block  Pseudo-Inverses 


Here  we  summarize  and  specialize  several  of  the  results  in  [11]  concerning  the  generalized 
inverse  (in  the  sense  of  (2.20))  of  the  matrix 

Z=(#rf)  (A.1) 

when  H  has  full  column  rank,  i.e.  when  is  invertible.  Let 

(A.2) 

denote  any  symmetric  matrix  satisfying  (2.20),  which  in  this  case  reduces  to 


RWR  +  RUH'^  HU'^R  +  HTH^  =  R  (A.3) 

RWHArHU'^H  =  H  (A.4) 

H'^WH  =  0.  (A.5) 

In  [11]  the  following  results  are  proved: 

RWH  =  0  (A.6) 

D  =  R  —  RWR  is  imiquely  determined  by  R  and  H  (A- 7) 


HTH'^  =  -D  (A.8) 

RUH^  =  D.  (A.9) 

Note  next  that  from  (A.8)  and  the  invertibility  of  we  have  that  T  is  also  unique 

and  given  by 

T  =  (A.IO) 

Also,  from  (A.4),  (A.6)  and  the  invertibility  of 

U^H  =  =  I  (A.ll) 

proving  (4.27)  of  Lemma  4.2,  so  that  U'^  is  a  left  inverse®  of  H.  Thus  from  (2.21) 

M{xml)  =  (o  ^  X  -  X.  (A.12) 

Also  from  (2.21) 

®Note  tliat  is  not  an  arbitrary  left-inverse  of  H  as  there  is  the  additional  constraint  (A.9)  that  U 
must  satisfy. 
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(A.14) 


However  from  (A.8),  (A. 9)  and  (A.ll)  we  see  that 

Pml  =  =  -T 

proving  (2.22)  as  well  as  (4.26)  of  Leimna  4.2. 

Finally,  to  prove  identity  (2.25),  we  note  that  by  sunaming  (A.8)  and  (A.9)  and  post- 
multiplying  by  we  obtain 

RU  +  HT  =  0,  (A.15) 

which  when  combined  with  (A.ll)  gives  (2.25). 


Appendix  B:  Derivation  of  (4.45) 


Let 


Then, 


APiA'^  +  Q  -S  E 

Hi  =  I  -5^  R  C  \  ,  i=l,2. 
CT  0 


0 

pi  -  p2  =  (  0  0  /  )  (fit  _  fit)  I  0  I  . 


iProm  identity  (2.25),  there  exists  A  such  that 

/  0  \ 

0  =  fi2  A 

\l} 


so  that 


0 

112^5  I  0  I  =  fi2fi2fi2'^ 


which  using  the  property  (2.20)  of  pseudo-inverses  yields 


^  0  ^ 

0 

=: 

0 

W  J 

U/ 

Similarly  we  can  show  that 

(  0  0  J  )  fiifit  =  (  0  0  /  )  . 

Now,  by  using  identities  (B.5)  and  (B.6)  in  (B.2)  we  get 


pi  -  p2  =  (  0  0  /  )  fit(fii  -  fi2)fi|  0 


(B.l) 

(B.2) 

(B.3) 

(B.4) 

(B.5) 

(B.6) 

(B.7) 
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But 


—  O2 


Thus, 


yl(pl  _  p2^T  Q  0  \  /  I  \ 

0  0  0  =  0  A(P^-p2)A^(j  00).  (B.8) 

0  0  0  /  \  0  / 


pi  _p2 


(0  0  j)nt 


/  I\ 


{L^A){P^  -  P^){L^A^) 


A{P^  -  P^)A^ 


/ 

(  /  0  0  ) 

\ 


0 

0 

I 


/ 

(B.9) 


33 


References 

[1]  M.  B.  Adams,  A.  S.  Willsky  and  B.  C.  Levy,  “Linear  estimation  of  boundary  value 
stochastic  processes-Part  I:  the  role  and  construction  of  complementary  models,”  IEEE 
Trans.  Automat.  Control,  vol.  AC-29,  pp.  803-810,  Sept.  1984. 

[2]  B.  D.  0.  Anderson  and  J.  B.  Moore,  Optimal  Filtering.  Englewood  Cliffs,  NJ:  Prentice- 
Hall,  1979. 

[3]  W.  F.  Arnold  and  A.  J.  Laub,  “Generalized  eigenproblem  algorithms  and  software  for 
algebraic  Riccati  equations,”  Proc.  IEEE,  vol.  72,  pp.  1746-1754,  Dec.  1984. 

[4]  D.  J.  Bender  and  A.  J.  Laub,  “The  linear- quadratic  optimal  regulator  for  descriptor 
systems,”  IEEE  Trans.  Automat.  Control,  vol.  AC-32,  pp.  672-688,  Aug.  1987. 

[5]  D.  J.  Bender  and  A.  J.  Laub,  “The  linear- quadratic  optimal  regulator  for  descriptor 
systems:  Discrete-time  case,”  Automatica,  vol.  23,  pp.  71-85,  Jan.  1987. 

[6]  A.  E.  Bryson,  Jr.  and  Y.-C.  Ho,  Applied  Optimal  Control,  Waltham,  MA:  Ginn  and 
Co.,  1969. 

[7]  P.  Bernhard,  J.  Grimm  and  X.-M.  Wang,  “Commande  optimale  lineaire  quadratique 
des  systemes  implicites,”  APII,  vol.  24,  pp.  17-34,  1990. 

[8]  P.  Bernhard  and  X.-M.  Wang,  “Filtrage  des  systemes  implicites  lineaires  discrets,”  C. 
R.  Acad.  Sc.,  Paris,  t.  304,  serie  I,  pp.  351-354,  1987. 

[9]  S.  L.  Campbell,  Singular  Systems  of  Differential  Equations,  Research  Notes  in  Math., 
No.  40.  San  Francisco,  CA:  Pitman,  1980. 

[10]  S.  L.  Campbell,  Singular  Systems  of  Differential  Equations  II,  Research  Notes  in  Math., 
No.  61.  San  Francisco,  CA:  Pitman,  1982. 

[11]  S.  L.  Campbell  and  C.  D.  Meyer,  Generalized  Inverses  of  Linear  Transformations. 
London:  Pitman,  1979. 

[12]  D.  E.  Catlin,  Estimation,  Control,  and  the  Discrete  Kalman  Filter.  New  York,  NY: 
Springer  Verlag,  1989. 

[13]  L.  Dai,  “Filtering  and  LQG  problems  for  discrete-time  stochastic  singular  systems,” 
IEEE  Trans.  Automat.  Control,  vol.  34,  pp.  1105-1108,  Oct.  1989. 

[14]  A.  Emami-Naeini  and  G.  F.  Franklin,  “Deadbeat  control  and  tracking  of  discrete-time 
systems,”  IEEE  Trans.  Automat.  Control,  vol.  AC-27,  pp.  176-181,  Feb.  1982. 

[15]  G.  H.  Golub  and  C.  F.  Van  Loan,  Matrix  Computations,  2nd  ed.  Baltimore,  MD:  The 
Johns  Hopkins  Univ.  Press,  1989. 

[16]  A.  J.  Krener,  “Acausal  realization  theory.  Part  I:  Linear  deterministic  systems,”  SIAM 
J.  Control  Optimiz.,  vol.  25,  pp.  499-525,  May  1987. 


34 


[17]  V.  Kucera,  “The  discrete  Riccati  equation  of  optimal  control,”  Kybernetika,  vol.  8, 
1972. 

[18]  F.  L.  Lewis  and  V.  G.  Mertzios,  eds.,  Special  issue:  Recent  advances  in  singular  systems, 
in  Circuits,  Syst.,  and  Signal  Processing,  vol.  8,  1989. 

[19]  D.  G.  Luenberger,  “Dynamic  systems  in  descriptor  form,”  IEEE  Trans.  Automat.  Con¬ 
trol,  vol.  AC-22,  pp.  312-321,  June  1977. 

[20]  D.  G.  Luenberger,  “Time-invariant  descriptor  systems,”  Automatica,  vol.  14,  pp.  473- 
480,  Sept.  1978. 

[21]  D.  G.  Luenberger,  “Boundary  recursion  for  descriptor  variable  systems,”  IEEE  Trans. 
Automat.  Control,  vol.  AC-34,  pp.  287-292,  March  1989. 

[22]  V.  Mehrmann,  “Existence,  uniqueness,  and  stability  of  solutions  to  singtdar  linear 
quadratic  optimal  control  problems,”  Linear  Algebra  and  its  Applications,  vol.  121,  pp. 
291-331,  1989. 

[23]  R.  Nikoukhah,  “A  deterministic  and  stochastic  theory  for  two-point  boundary-value 
descriptor  systems,”  Ph.D.  dissertation.  Dept.  Elec.  Eng.  Comp.  Sci.,  Mass.  Inst.  Tech¬ 
no!.,  Cambridge,  MA,  Sept.  1988. 

[24]  R.  Nikoukhah,  M.  B.  Adams,  A.  S.  Willsky,  and  B.  C.  Levy,  “Estimation  for  boundary- 
value  descriptor  systems,”  Circuits,  Syst.,  Signal  Processing,  vol.  8,  pp.  25-48,  1989. 

[25]  R.  Nikoukhah,  B.  C.  Levy,  and  A.  S.  Willsky,  “Stability,  stochastic  stationarity,  and 
generalized  Lyapunov  equations  for  two-point  boundary-value  descriptor  systems,” 
IEEE  Trans.  Automat.  Control,  vol.  34,  pp.  1141-1152,  Nov.  1989. 

[26]  R.  Nikoukhah,  B.  C.  Levy,  and  A.  S.  Willsky,  “Smoothing  algorithms  for  boundary- 
value  descriptor  systems,”  in  preparation. 

[27]  R.  Nikoukhah,  A.  S.  Willsky,  and  B.  C.  Levy,  “Boundary-value  descriptor  systems: 
well-posedness,  reachability  and  observability,”  Int.  J.  Control,  vol.  46,  pp.  1715-1737, 
Nov.  1987. 

[28]  R.  Nikoukhah,  A.  S.  Willsky,  and  B.  C.  Levy,  “Generalized  Riccati  equations  for  two- 
point  boundary- value  descriptor  systems,”  in  Proc.  26th  IEEE  Conf.  on  Decision  and 
Control,  Los  Angeles,  CA,  Dec.  1987,  pp.  1140-1141. 

[29]  R.  Nikotikhah,  A.  S.  Willsky,  and  B.  C.  Levy,  “Reachability,  observability  and  mini¬ 
mality  for  shift -invariant  two-point  boundary- value  descriptor  systems,”  Circuits,  Syst., 
Signal  Processing,  vol.  8,  pp.  313-340,  1989. 

[30]  R.  Nikoukhah,  A.  S.  Willsky,  and  B.  C.  Levy.  “Kalman  filtering  and  Riccati  equations 
for  descriptor  systems,”  Technical  report  no.  1186,  Institut  National  de  Recherche  en 
Informatique  et  Automatique,  Rocquencourt,  France,  March  1990. 


35 


[31]  T.  Pappas,  A.  J.  Laub,  and  N.  R.  Sandell,  Jr.,  “On  the  numerical  solution  of  the 
discrete-time  algebraic  Riccati  equation,”  IEEE  Trans.  Automat.  Control,  vol.  AC-25, 
pp.  631-641,  Aug.  1980. 

[32]  P.  Van  Dooren,  “A  generalized  eigenvalue  approach  for  solving  Riccati  equations,” 
SIAM  J.  Sci.  Stat.  Comput.,  vol.  2,  pp.  121-135, 1981. 

[33]  X-M.  Wang  and  P.  Bernhard,  “Filtrage  et  lissage  des  systemes  implicites  discrets,” 
Teclinical  report  no.  1083,  Institut  National  de  Recherche  en  Informatique  et  Automa- 
tique,  Rocquencourt,  France,  Aug.  1989. 

[34]  P.  Whittle,  Prediction  and  Regulation  by  Linear  Least-square  Methods,  2nd  ed.  Min¬ 
neapolis,  MN:  University  of  Minnesota  Press,  1983. 

[35]  J.  C.  WiUems,  “A  framework  for  the  study  of  dynamical  systems,”  in  Realization  and 
Modelling  in  System  Theory,  Proc.  MTNS-89,  vol.  1,  M.  A.  Kaashoek,  J.  H.  Van 
Schuppen,  and  A.  C.  M.  Ran,  eds.,  pp.  43-60.  Boston,  MA:  Birkhauser  Verlag,  1990. 


36 


